TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents
About
We introduce a new approach to generative data-driven dialogue systems (e.g. chatbots) called TransferTransfo, which combines a Transfer-learning-based training scheme with a high-capacity Transformer model. Fine-tuning is performed with a multi-task objective that combines several unsupervised prediction tasks. The resulting fine-tuned model shows strong improvements over the current state-of-the-art end-to-end conversational models such as memory-augmented seq2seq and information-retrieval models. On the privately held PERSONA-CHAT dataset of the Conversational Intelligence Challenge 2, this approach obtains a new state of the art, with respective perplexity, Hits@1 and F1 metrics of 16.28 (45% absolute improvement), 80.7 (46% absolute improvement) and 19.5 (20% absolute improvement).
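For readers unfamiliar with the three metrics reported above, the sketch below shows how they are commonly computed. It is a minimal illustration under stated assumptions, not the challenge's official evaluation code: the function names, whitespace tokenization, and input shapes are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def perplexity(token_log_probs):
    """Exponential of the mean negative log-likelihood per token.

    Assumes natural-log probabilities, one per reference token.
    """
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def hits_at_1(scores_per_example, gold_indices):
    """Fraction of examples whose top-scored candidate is the gold reply.

    scores_per_example: one list of candidate scores per dialogue turn.
    gold_indices: position of the correct candidate in each list.
    """
    hits = 0
    for scores, gold in zip(scores_per_example, gold_indices):
        best = max(range(len(scores)), key=scores.__getitem__)
        if best == gold:
            hits += 1
    return hits / len(gold_indices)


def word_f1(prediction, reference):
    """Word-overlap F1 between a generated reply and a reference reply.

    Assumes lowercased whitespace tokenization (an illustrative choice).
    """
    pred = prediction.lower().split()
    ref = reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Lower perplexity is better, while higher Hits@1 and F1 are better, which is why the improvements quoted in the abstract point in different numerical directions.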
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Dialogue Generation | PersonaChat (test) | Persona Consistency | 0.508 | 27 |
| Response Selection | ConvAI2 (dev) | R@1/20 | 82.1 | 25 |
| Personalized Dialogue Generation | PersonaChat (Human Evaluation) | Fluency | 3.55 | 16 |
| Response Selection | ConvAI2 (test) | R@20 | 80.7 | 16 |
| Dialogue Generation | PERSONA-CHAT original (dev) | Hits@1 | 82.1 | 13 |
| Persona-based Dialogue | ConvAI2 (test) | Hits@1 | 82.1 | 10 |
| Knowledge-Grounded Conversation | Personalized KGC dataset (test) | BLEU-1 | 6.09 | 9 |
| Knowledge-grounded dialog | Wizard-of-Wikipedia (WoW) (test) | BLEU | 18.3 | 9 |
| Dialogue Modeling | PERSONA-CHAT (val) | Hits@1 | 82.1 | 5 |
| Persona-based Dialogue Generation | ConvAI2 | Coherence | 1.83 | 5 |