DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation
About
We present DialoGPT (dialogue generative pre-trained transformer), a large, tunable neural model for conversational response generation. Trained on 147M conversation-like exchanges extracted from Reddit comment chains spanning 2005 through 2017, DialoGPT extends the Hugging Face PyTorch transformer to attain performance close to human in terms of both automatic and human evaluation in single-turn dialogue settings. We show that conversational systems leveraging DialoGPT generate more relevant, contentful, and context-consistent responses than strong baseline systems. The pre-trained model and training pipeline are publicly released to facilitate research into neural response generation and the development of more intelligent open-domain dialogue systems.
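Since the released checkpoints plug into the Hugging Face transformers stack, the sketch below shows one way to run single-turn response generation. The `microsoft/DialoGPT-medium` checkpoint is the one published on the Hugging Face Hub; the decoding parameters are illustrative assumptions, not the paper's evaluation configuration.

```python
# Minimal sketch: single-turn generation with a released DialoGPT
# checkpoint via Hugging Face transformers. Sampling settings below
# (top_k, top_p, max_length) are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# DialoGPT separates dialogue turns with the end-of-sequence token.
user_turn = "Does money buy happiness?"
input_ids = tokenizer.encode(user_turn + tokenizer.eos_token, return_tensors="pt")

# The model continues the dialogue after the EOS token.
output_ids = model.generate(
    input_ids,
    max_length=200,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens (the response turn).
response = tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```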
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Dialogue Summarization | SamSum (test) | ROUGE-2 | 16.58 | 80 |
| Abstractive Dialogue Summarization | SamSum (test) | ROUGE-L | 38.42 | 53 |
| Emotional Support Conversation | ESConv (test) | BLEU-2 | 5.52 | 44 |
| Conversational Performance | REDIAL (test) | Distinct-3 | 62.09 | 37 |
| Conversation | INSPIRED | Distinct-2 | 2.408 | 27 |
| Recommendation | REDIAL | R@10 | 17.3 | 24 |
| Paraphrase Generation | QQP (test) | BLEU-2 | 28.45 | 22 |
| Conversational Performance | TG-REDIAL (test) | Distinct-2 | 1.1881 | 21 |
| Dialogue Generation | Douban (test) | BLEU-1 | 0.0953 | 20 |
| Intent Recognition | OOS (test) | Overall Accuracy | 83.9 | 19 |
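Several of the results above report Distinct-n, which measures lexical diversity as the ratio of unique n-grams to total n-grams across generated responses. A minimal sketch, assuming whitespace tokenization and corpus-level aggregation (individual leaderboards may differ in these details):

```python
# Minimal sketch of the Distinct-n diversity metric: unique n-grams
# divided by total n-grams over a set of generated responses.
# Whitespace tokenization is an illustrative assumption.
def distinct_n(responses, n):
    ngrams = []
    for response in responses:
        tokens = response.split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

# Repeated responses lower the score: 5 unique of 8 bigrams -> 0.625.
print(distinct_n(["i do not know", "i do not know", "that is great"], 2))
```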