| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Dialogue Response Generation | Persona-Chat | BLEU-1 | 53.3 | 20 |
| Next Utterance Prediction | Persona-Chat (val) | Accuracy | 77.01 | 13 |
| Dialogue Generation | PERSONA-CHAT Original (dev) | Hits@1 | 89.5 | 13 |
| Response Selection | PERSONA-CHAT Revised (test) | R@1 | 82.79 | 11 |
| Response Selection | PERSONA-CHAT Original Persona (test) | R@1 | 87.45 | 11 |
| Dialogue Generation | PERSONA-CHAT Revised (dev) | Hits@1 | 85 | 11 |
| Human Evaluation of Dialogue | Persona-Chat 1.0 (test) | Fluency | 4.31 | 9 |
| Profile Prediction | Persona-Chat | Error Rate (Profile) | 1.1 | 8 |
| Smart Reply | PERSONA-CHAT (test) | ROUGE Score | 7.71 | 7 |
| Dialog utterance prediction | PERSONA-CHAT Revised v1 | Hits@1 | 0.354 | 6 |
| Dialog utterance prediction | PERSONA-CHAT Original v1 | Hits@1 | 51.1 | 6 |
| Dialog utterance prediction | PERSONA-CHAT No Persona v1 | Hits@1 | 0.349 | 6 |
| Dialogue Modeling | PERSONA-CHAT (val) | Hits@1 | 82.1 | 5 |
| Dialogue Modeling | PERSONA-CHAT (test) | F1 | 19.5 | 4 |
| Turn-level dialogue quality evaluation (Uses Knowledge) | Persona-Chat turn-level (test) | Spearman Correlation | 0.6309 | 3 |
| Turn-level dialogue quality evaluation (Interesting) | Persona-Chat turn-level (test) | Spearman Correlation | 0.2634 | 3 |
| Turn-level dialogue quality evaluation (Maintains Context) | Persona-Chat turn-level (test) | Spearman Corr (Context) | 0.5625 | 3 |
| Turn-level dialogue quality evaluation (Understandable) | Persona-Chat turn-level (test) | Spearman Correlation (Understandable) | 0.1324 | 3 |
| Persona Perception | PERSONA-CHAT synthesized Revised (test) | Hits@1 | 78.2 | 3 |
| Persona Perception | PERSONA-CHAT synthesized Original (test) | Hits@1 | 93.8 | 3 |
| Dialogue Generation | PERSONA-CHAT original (dev) | Category 1 Score | 41.7 | 3 |