EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training

About

Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems. However, previous works mainly focus on showing and evaluating the conversational performance of the released dialogue model, ignoring the discussion of some key factors towards a powerful human-like chatbot, especially in Chinese scenarios. In this paper, we conduct extensive experiments to investigate these under-explored factors, including data quality control, model architecture designs, training approaches, and decoding strategies. We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and will make our models and codes publicly available. Automatic and human evaluations show that EVA2.0 significantly outperforms other open-source counterparts. We also discuss the limitations of this work by presenting some failure cases and pose some future research directions on large-scale Chinese open-domain dialogue systems.

Yuxian Gu, Jiaxin Wen, Hao Sun, Yi Song, Pei Ke, Chujie Zheng, Zheng Zhang, Jianzhu Yao, Lei Liu, Xiaoyan Zhu, Minlie Huang • 2022

Related benchmarks

Task	Dataset	Result
Short-text generation	Weibo (test)	F1 Score 12.94	6
Short-text generation	LCCC (test)	F1 Score 11.75	6
Short-text generation	Douban (test)	F1 Score 9.59	6
Short-text generation	Douban	Informativeness 2.5	6
Short-text generation	Weibo	Informativeness 2.75	6
Short-text generation	LCCC	Informativeness 2.83	6
Open-domain Conversation	Chinese open-domain conversation Self-chat (test)	Coherence 150.8	4
