| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Dialogue State Tracking | MultiWOZ 2.1 (test) | Joint Goal Accuracy 60.76 | 105 |
| Dialog State Tracking | MultiWOZ 2.1 (test) | Joint Goal Accuracy 79.2 | 88 |
| Dialogue State Tracking | MultiWOZ 2.2 (test) | Joint Goal Accuracy 66.25 | 80 |
| End-to-end task-oriented dialogue | MultiWOZ (test) | Task Success Rate 94 | 68 |
| End-to-End Task-Oriented Dialogue | MultiWOZ 2.1 (test) | BLEU Score 22.19 | 57 |
| Dialog State Tracking | MultiWOZ 2.0 (test) | Joint Goal Accuracy 55.03 | 47 |
| Dialogue State Tracking | MultiWOZ 2.4 (test) | Joint Goal Acc 78.2 | 45 |
| Task-Oriented Dialogue | MultiWOZ 2.0 (test) | Inform Rate 99.1 | 37 |
| Dialogue State Tracking | MultiWOZ 2.1 | Joint Goal Accuracy 60.61 | 36 |
| Dialogue State Tracking | MultiWOZ 2.0 (test) | Joint Goal Accuracy 57.23 | 29 |
| Response Generation | MultiWOZ (test) | BLEU Score 35.1 | 27 |
| Task-Oriented Dialogue | MultiWOZ 2.2 (test) | Inform Rate 96.48 | 23 |
| End-to-end Task-oriented Dialogue | MultiWOZ 2.0 (test) | Inform Accuracy 97.5 | 22 |
| End-to-end Dialogue Modelling | MultiWOZ 2.0 (test) | Inform Rate 95.4 | 22 |
| Knowledge-grounded Dialog Generation | MultiWOZ | Win Rate 98.7 | 20 |
| Spoken Dialogue State Tracking | MultiWOZ (test) | Joint Goal Acc 32.4 | 17 |
| Untargeted Dialogue State Extraction | MultiWOZ | State Precision 42 | 15 |
| Task-Oriented Dialogue | MultiWOZ 2.4 (test) | JGA 43.8 | 15 |
| User simulator goal alignment | MultiWOZ Challenge | Profile Success Rate (Prof.) 80.1 | 14 |
| Multi-domain Dialog | MultiWOZ | Success Rate 85.3 | 13 |
| Task-oriented dialogue | MultiWOZ 2.0 | Inform Rate 91.8 | 13 |
| Dialogue State Tracking | MultiWOZ zero-shot 2.1 | Attraction Accuracy 38.86 | 11 |
| End-to-end dialog modeling | MultiWOZ 2.0 | Inform Score 94.3 | 11 |
| Task-Focused Dialogue | Multiwoz | TSE Score 0.7699 | 11 |
| Task-Oriented Dialogue | MultiWOZ 2.1 (test) | Inform Rate 99.62 | 11 |